Template-free word spotting in low-quality manuscripts

نویسندگان

  • Huaigu Cao
  • Venu Govindaraju
چکیده

As the OCR technique is not yet adequate for handwritten scripts with large lexicon, word spotting has been introduced as an alternative to OCR. This paper proposes a novel approach to word spotting that, instead of matching features of the word image to features extracted from predefined templates, uses the estimated posterior probability as the output of well trained classifier for spotting. Gabor features are extracted from gray scale image in order to yield higher performance on degraded, low quality document image.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Radial Line Fourier Descriptor for Segmentation-free Handwritten Word Spotting

Automatic recognition of historical handwritten manuscripts is a daunting task due to paper degradation over time. Recognition-free retrieval or word spotting is popularly used for information retrieval and digitization of the historical handwritten documents. However, the performance of word spotting algorithms depends heavily on feature detection and representation methods. Although there exi...

متن کامل

Local Binary Pattern for Word Spotting in Handwritten Historical Document

Digital libraries store images which can be highly degraded and to index this kind of images we resort to word spotting as our information retrieval system. Information retrieval for handwritten document images is more challenging due to the difficulties in complex layout analysis, large variations of writing styles, and degradation or low quality of historical manuscripts. This paper presents ...

متن کامل

Adaptation des caractéristiques pseudo-Haar pour le word spotting dans les documents manuscrits

This paper addresses the problem of word spotting in handwritten documents. We propose a coarse-to-fine segmentation free approach. This approach is based on two filtering phases, which are a global filtering followed by a local filtering after changing the observation scale. The contribution of this work is the use and the adaptation of the Haarlike-features in word spotting task for each test...

متن کامل

Lexicon-free handwritten word spotting using character HMMs

For retrieving keywords from scanned handwritten documents, we present a word spotting system that is based on character Hidden Markov Models. In an efficient lexicon-free approach, arbitrary keywords can be spotted without pre-segmenting text lines into words. For a multi-writer scenario on the IAM off-line database as well as for two single writer scenarios on historical data sets, it is show...

متن کامل

A Survey on Various Word Spotting Techniques for Content Based Document Image Retrieval

Searching documents for information and retrieval of relevant documents is a basic activity. Various tools are readily available for searching and retrieval from digital documents, but not much robust methods are available for retrieval from historic documents and old manuscripts as they are not digitized but available in scanned formats. Conventional way of retrieval from scanned document imag...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006